Search | BVS Regional Portal

A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data.

Jia, Kai; Kundrot, Steven; Palchuk, Matvey B; Warnick, Jeff; Haapala, Kathryn; Kaplan, Irving D; Rinard, Martin; Appelbaum, Limor.

EBioMedicine ; 98: 104888, 2023 Dec.

Article in English | MEDLINE | ID: mdl-38007948

ABSTRACT

BACKGROUND: Pancreatic Duct Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection.

METHODS: Neural Network (PrismNN) and Logistic Regression (PrismLR) were developed using EHR data from 55 US Health Care Organisations (HCOs) to predict PDAC risk 6-18 months before diagnosis for patients 40 years or older. Model performance was assessed using Area Under the Curve (AUC) and calibration plots. Models were internal-externally validated by geographic location, race, and time. Simulated model deployment evaluated Standardised Incidence Ratio (SIR) and other metrics.

FINDINGS: With 35,387 PDAC cases, 1,500,081 controls, and 87 features per patient, PrismNN obtained a test AUC of 0.826 (95% CI: 0.824-0.828) (PrismLR: 0.800 (95% CI: 0.798-0.802)). PrismNN's average internal-external validation AUCs were 0.740 for locations, 0.828 for races, and 0.789 (95% CI: 0.762-0.816) for time. At SIR = 5.10 (exceeding the current screening inclusion threshold) in simulated model deployment, PrismNN sensitivity was 35.9% (specificity 95.3%).

INTERPRETATION: Prism models demonstrated good accuracy and generalizability across diverse populations. PrismNN could find 3.5 times more cases at comparable risk than current screening guidelines. The small number of features provided a basis for model interpretation. Integration with the federated network provided data from a large, heterogeneous patient population and a pathway to future clinical deployment.

FUNDING: Prevent Cancer Foundation, TriNetX, Boeing, DARPA, NSF, and Aarno Labs.
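The metrics named in this abstract (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. The scores and labels below are synthetic toy values, not study data, and the function names are hypothetical. The final lines show one plausible reading of the "3.5 times more cases" statement, assuming it compares the reported 35.9% sensitivity with the roughly 10% of cases eligible under current guidelines; the abstract does not spell out that calculation.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Synthetic toy data for illustration only (1 = case, 0 = control).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)

# Assumed reading of the abstract: reported sensitivity divided by the
# approximate fraction of cases eligible under current guidelines.
reported_sensitivity = 0.359
guideline_eligible_fraction = 0.10
case_finding_ratio = reported_sensitivity / guideline_eligible_fraction
```

Under that assumption the ratio is about 3.6, in line with the rounded figure quoted in the abstract.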

Subjects

Pancreatic Ductal Carcinoma , Pancreatic Neoplasms , Humans , Pancreatic Ductal Carcinoma/pathology , Logistic Models , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/epidemiology , Pancreatic Neoplasms/etiology , Retrospective Studies , Multicenter Studies as Topic

Development and validation of a pancreatic cancer risk model for the general population using electronic health records: An observational study.

Appelbaum, Limor; Cambronero, José P; Stevens, Jennifer P; Horng, Steven; Pollick, Karla; Silva, George; Haneuse, Sebastien; Piatkowski, Gail; Benhaga, Nordine; Duey, Stacey; Stevenson, Mary A; Mamon, Harvey; Kaplan, Irving D; Rinard, Martin C.

Eur J Cancer ; 143: 19-30, 2021 Jan.

Article in English | MEDLINE | ID: mdl-33278770

ABSTRACT

AIM: Pancreatic ductal adenocarcinoma (PDAC) is often diagnosed at a late, incurable stage. We sought to determine whether individuals at high risk of developing PDAC could be identified early using routinely collected data.

METHODS: Electronic health record (EHR) databases from two independent hospitals in Boston, Massachusetts, providing inpatient, outpatient, and emergency care, from 1979 through 2017, were used with case-control matching. PDAC cases were selected using International Classification of Diseases 9/10 codes and validated with tumour registries. A data-driven feature selection approach was used to develop neural networks and L2-regularised logistic regression (LR) models on training data (594 cases, 100,787 controls) and compared with a published model based on hand-selected diagnoses ('baseline'). Model performance was validated on an external database (408 cases, 160,185 controls). Three prediction lead times (180, 270 and 365 days) were considered.

RESULTS: The LR model had the best performance, with an area under the curve (AUC) of 0.71 (confidence interval [CI]: 0.67-0.76) for the training set, and AUC 0.68 (CI: 0.65-0.71) for the validation set, 365 days before diagnosis. Data-driven feature selection improved results over 'baseline' (AUC = 0.55; CI: 0.52-0.58). The LR model flags 2692 (CI 2592-2791) of 156,485 as high risk, 365 days in advance, identifying 25 (CI: 16-36) cancer patients. Risk stratification showed that the high-risk group presented a cancer rate 3 to 5 times the prevalence in our data set.

CONCLUSION: A simple EHR model, based on diagnoses, can identify high-risk individuals for PDAC up to one year in advance. This inexpensive, systematic approach may serve as the first sieve for selection of individuals for PDAC screening programs.
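As a rough consistency check on the risk-stratification statement, the following Python sketch compares the cancer rate among flagged patients with a prevalence estimate. It assumes, purely for illustration, that the external validation counts (408 cases, 160,185 controls) approximate the prevalence the authors refer to; the abstract does not give the case count for the 365-day cohort of 156,485, so this is not necessarily the authors' own calculation. Variable names are hypothetical.

```python
# Figures quoted in the abstract (point estimates only).
flagged_high_risk = 2692       # patients flagged by the LR model
cancers_among_flagged = 25     # cancer patients identified among those flagged

# Assumption: use the external validation counts as a prevalence proxy.
validation_cases = 408
validation_controls = 160185

# Cancer rate within the flagged high-risk group.
rate_flagged = cancers_among_flagged / flagged_high_risk

# Assumed data-set prevalence.
assumed_prevalence = validation_cases / (validation_cases + validation_controls)

# How many times higher the rate is in the flagged group.
enrichment = rate_flagged / assumed_prevalence

print(f"rate among flagged: {rate_flagged:.4%}")
print(f"assumed prevalence: {assumed_prevalence:.4%}")
print(f"enrichment factor:  {enrichment:.2f}")
```

With these inputs the enrichment factor comes out at roughly 3.7, which sits inside the 3 to 5 times range reported in the abstract.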

Subjects

Adenocarcinoma/epidemiology , Pancreatic Ductal Carcinoma/epidemiology , Electronic Health Records/standards , Female , Humans , Male , Reproducibility of Results , Research Design

Rapid haplotype inference for nuclear families.

Williams, Amy L; Housman, David E; Rinard, Martin C; Gifford, David K.

Genome Biol ; 11(10): R108, 2010.

Article in English | MEDLINE | ID: mdl-21034477

ABSTRACT

Hapi is a new dynamic programming algorithm that ignores uninformative states and state transitions in order to efficiently compute minimum-recombinant and maximum likelihood haplotypes. When applied to a dataset containing 103 families, Hapi performs 3.8 and 320 times faster than state-of-the-art algorithms. Because Hapi infers both minimum-recombinant and maximum likelihood haplotypes and applies to related individuals, the haplotypes it infers are highly accurate over extended genomic distances.

Subjects

Algorithms , Computational Biology/methods , Haplotypes , Genetic Databases , Gene Conversion , Genetic Loci , Genotype , Humans , Likelihood Functions , Genetic Models , Nuclear Family , Pedigree , Single Nucleotide Polymorphism , Software
